Introduction

This report presents an exploratory analysis and visualisation of the Center for Policing Equity (CPE) data set. The data frame records crime incidents that happened in the United States in 2016, and it includes information about the police officer who dealt with each case and about the subject (the crime suspect). It also provides information about where exactly each incident happened.

The analysis uses the following R libraries: dplyr, tidyr, lubridate, ggplot2, COUNT, gridExtra, plotly, htmltools, spatial and leaflet.

Data Wrangling

In the data wrangling step, we first separate the rows of UOF numbers, and then we delete two columns, STREET_NAME and STREET_DIRECTION, because a large share of their values are missing (none or NA). As a rule, we deleted any variable in which almost 50% of the values are missing.
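The step above can be sketched roughly as follows. This is a minimal illustration on invented toy data, not the report's actual code: it assumes that one cell can hold several UOF numbers separated by commas, and that missing values appear either as NA or as the string "NULL". The object names `uof`, `missing_share` and `uof_clean` are hypothetical.

```r
library(dplyr)
library(tidyr)

# Toy stand-in for the CPE file; every value here is invented for illustration.
uof <- tibble(
  UOF_NUMBER       = c("101, 102", "103", "104"),
  STREET_NAME      = c("NULL", NA, "Main"),
  STREET_DIRECTION = c("NULL", "NULL", NA),
  SUBJECT_GENDER   = c("Male", "Female", "Male")
)

# Share of missing values per column, counting both NA and the string "NULL".
missing_share <- colMeans(is.na(uof) | uof == "NULL")

uof_clean <- uof %>%
  # Assumption: several UOF numbers can share one cell, separated by commas.
  separate_rows(UOF_NUMBER, sep = ",\\s*") %>%
  # Drop the two street columns named in the report.
  select(-STREET_NAME, -STREET_DIRECTION)
```

Inspecting `missing_share` before dropping columns makes the "almost 50%" rule easy to check against each variable.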

We filter out the NULL values from the variables INCIDENT_TIME, SUBJECT_DESCRIPTION and SUBJECT_GENDER, and we remove the “0” values from the variables OFFICER_ID and SUBJECT_ID.
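A filtering step of this kind could look like the sketch below. It assumes that missing entries are stored as the literal string "NULL" and that the ID columns are read as character; the toy values and the name `uof_filtered` are made up for illustration.

```r
library(dplyr)

# Toy data; all values are invented.
uof <- tibble(
  INCIDENT_TIME       = c("22:15:00", "NULL", "08:00:00", "13:30:00"),
  SUBJECT_DESCRIPTION = c("Alcohol", "Unknown", "NULL", "Unknown"),
  SUBJECT_GENDER      = c("Male", "Female", "Male", "Male"),
  OFFICER_ID          = c("10", "11", "12", "0"),
  SUBJECT_ID          = c("5", "6", "7", "8")
)

uof_filtered <- uof %>%
  # Keep only rows where none of the three variables is the string "NULL".
  filter(
    INCIDENT_TIME != "NULL",
    SUBJECT_DESCRIPTION != "NULL",
    SUBJECT_GENDER != "NULL"
  ) %>%
  # Remove placeholder IDs recorded as "0".
  filter(OFFICER_ID != "0", SUBJECT_ID != "0")
```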

We also extract the month from each incident date and divide the time of day into four categories: “Morning”, “Afternoon”, “Evening” and “Night”.
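One possible way to derive these two variables with lubridate is sketched below. The report states only that night runs from 7:00 pm to 3:00 am; the other three boundaries, the month/day/year date format and the 24-hour time format are assumptions made for this example.

```r
library(dplyr)
library(lubridate)

# Toy data; date and time formats are assumed, values are invented.
uof <- tibble(
  INCIDENT_DATE = c("2/13/16", "2/14/16", "7/4/16", "11/1/16"),
  INCIDENT_TIME = c("22:15:00", "02:30:00", "09:00:00", "18:00:00")
)

uof_time <- uof %>%
  mutate(
    # Month of the incident as an ordered factor.
    INCIDENT_MONTH = month(mdy(INCIDENT_DATE), label = TRUE),
    # Hour of the day (0 to 23) taken from the time string.
    INCIDENT_HOUR = hour(hms(INCIDENT_TIME)),
    TIME_OF_DAY = case_when(
      # Night: 7:00 pm up to (but not including) 3:00 am, as in the report.
      INCIDENT_HOUR >= 19 | INCIDENT_HOUR < 3 ~ "Night",
      # The remaining cut-offs are assumptions.
      INCIDENT_HOUR < 12 ~ "Morning",
      INCIDENT_HOUR < 17 ~ "Afternoon",
      TRUE ~ "Evening"
    )
  )
```

Using `case_when()` keeps every boundary in one place, so a different definition of the categories only requires changing the cut-off hours.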

## [1] "character"

Explore Data

Let’s explore the data that we just loaded into R. First, find out the names of the variables:

##  [1] "INCIDENT_DATE"           "INCIDENT_TIME"          
##  [3] "UOF_NUMBER"              "OFFICER_ID"             
##  [5] "OFFICER_GENDER"          "OFFICER_RACE"           
##  [7] "OFFICER_HIRE_DATE"       "OFFICER_YEARS_ON_FORCE" 
##  [9] "OFFICER_INJURY"          "OFFICER_INJURY_TYPE"    
## [11] "OFFICER_HOSPITALIZATION" "SUBJECT_ID"             
## [13] "SUBJECT_RACE"            "SUBJECT_GENDER"         
## [15] "SUBJECT_INJURY"          "SUBJECT_INJURY_TYPE"    
## [17] "SUBJECT_WAS_ARRESTED"    "SUBJECT_DESCRIPTION"    
## [19] "SUBJECT_OFFENSE"         "REPORTING_AREA"         
## [21] "BEAT"                    "SECTOR"                 
## [23] "DIVISION"                "LOCATION_DISTRICT"      
## [25] "STREET_NUMBER"           "LOCATION_CITY"          
## [27] "LOCATION_STATE"          "LOCATION_LATITUDE"      
## [29] "LOCATION_LONGITUDE"      "INCIDENT_REASON"        
## [31] "REASON_FOR_FORCE"        "TYPE_OF_FORCE_USED1"    
## [33] "FORCE_EFFECTIVE"

After data wrangling, the complete data set has 33 variables and 5070 observations.
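Checks like these are usually made with a few base R calls. The sketch below runs them on a tiny invented data frame, so its dimensions are not those of the real data set.

```r
# Toy data frame standing in for the wrangled data; values are invented.
uof <- data.frame(
  INCIDENT_DATE = c("2/13/16", "7/4/16"),
  UOF_NUMBER    = c("101", "102"),
  stringsAsFactors = FALSE
)

names(uof)                 # variable names
dim(uof)                   # number of observations, then number of variables
class(uof$INCIDENT_DATE)   # storage type of a single column
```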

Exploratory Data Analysis

Most of the officers are newly hired and have fewer than five years of experience on the force.
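A claim like this can be backed by a simple share calculation. The sketch below uses invented values and a hypothetical `EXPERIENCE_BAND` column; it assumes OFFICER_YEARS_ON_FORCE is numeric and that each row is one officer.

```r
library(dplyr)

# Toy data; values are invented.
officers <- tibble(
  OFFICER_ID             = c("1", "2", "3", "4", "5"),
  OFFICER_YEARS_ON_FORCE = c(1, 3, 2, 12, 4)
)

experience <- officers %>%
  # Split officers at the five-year mark used in the report.
  mutate(
    EXPERIENCE_BAND = if_else(
      OFFICER_YEARS_ON_FORCE < 5, "Under 5 years", "5 years or more"
    )
  ) %>%
  count(EXPERIENCE_BAND) %>%
  # Share of officers in each band.
  mutate(share = n / sum(n))
```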

From the above graph, we can observe that the largest percentage of the officers who investigated the crimes are White.

The graph shows that most of the people involved in the crimes were male and Black. It also shows the unknown values, which cover subjects whose gender was not recorded as male or female.

The above graph shows that most of the officers who dealt with the cases were male and White, and that the second most common officer race is Hispanic.

We found that males committed more of the crimes than females, and that fewer females than males were arrested after committing a crime.
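A comparison of this kind can be built from a two-way count. In the sketch below the toy values, the "Yes"/"No" coding of SUBJECT_WAS_ARRESTED and the column `share_within_gender` are assumptions.

```r
library(dplyr)

# Toy data; values and the "Yes"/"No" coding are assumed.
uof <- tibble(
  SUBJECT_GENDER       = c("Male", "Male", "Female", "Male", "Female"),
  SUBJECT_WAS_ARRESTED = c("Yes", "Yes", "Yes", "No", "No")
)

arrests <- uof %>%
  # Number of incidents for each gender and arrest outcome.
  count(SUBJECT_GENDER, SUBJECT_WAS_ARRESTED) %>%
  group_by(SUBJECT_GENDER) %>%
  # Share of each outcome within a gender, so groups of different size compare fairly.
  mutate(share_within_gender = n / sum(n)) %>%
  ungroup()
```

Looking at the within-gender share as well as the raw counts helps separate "fewer females were arrested" from "fewer females appear in the data".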

From the above graph, we can say that the month with the most incidents in Texas was February 2016, with 583 incidents.
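A monthly bar chart like the one described could be produced along these lines. The sketch assumes a month column already exists (here a hypothetical `INCIDENT_MONTH` factor) and uses invented data, so the counts differ from those in the report.

```r
library(dplyr)
library(ggplot2)

# Toy data; one row per incident, months are invented.
uof <- tibble(
  INCIDENT_MONTH = factor(c("Jan", "Feb", "Feb", "Mar", "Feb"), levels = month.abb)
)

# Number of incidents per month.
monthly <- uof %>% count(INCIDENT_MONTH)

# Bar chart of the monthly counts.
p <- ggplot(monthly, aes(x = INCIDENT_MONTH, y = n)) +
  geom_col() +
  labs(x = "Month", y = "Number of incidents")

# Month with the highest count.
busiest <- monthly %>% slice_max(n, n = 1)
```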

The above graph shows that most of the crimes happened at night. We define night as the period from 7:00 pm to 3:00 am.

We developed a graph of the subject description recorded by the officer against the city division where the crime happened. From the interactive graph, we find that most crimes occurred in the Central division in Texas. The most common subject description was alcohol or unknown drugs, which suggests that alcohol- and drug-related crime is common there.

We found that officers mostly arrest the subject using a verbal command (type of force used). Our analysis also indicates that most of the subjects were accompanied by a weapon, and that the second most common type of force was a weapon displayed at the subject.

The graph shows that, among male subjects, Black men were injured most often. Among female subjects, Black women were likewise injured most often.

To check the longitude and latitude of the crime locations, we developed the above graph, which links each location to the area where the crime happened.

We also created clusters of crime locations, which identify the different areas where crimes happened.
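With the leaflet package, clustered markers of this kind can be sketched as below. The coordinates are invented, and the report does not say which clustering options were used, so `markerClusterOptions()` is called with its defaults here.

```r
library(leaflet)

# Toy coordinates; values are invented for illustration.
locations <- data.frame(
  LOCATION_LATITUDE  = c(32.78, 32.79, 32.80, 32.90),
  LOCATION_LONGITUDE = c(-96.80, -96.81, -96.79, -96.70)
)

crime_map <- leaflet(locations) %>%
  # Default OpenStreetMap background tiles.
  addTiles() %>%
  # One marker per incident; nearby markers collapse into a cluster when zoomed out.
  addMarkers(
    lng = ~LOCATION_LONGITUDE,
    lat = ~LOCATION_LATITUDE,
    clusterOptions = markerClusterOptions()
  )
```

Printing `crime_map` in an interactive session or an R Markdown document renders the map. If the coordinate columns were read as character, they would need `as.numeric()` first.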

Conclusion

This report presents an exploratory and visual analysis of the given police department data. We extracted plenty of information from the data set: most officers have only 2-3 years of experience and are White, and the second most common officer race is Hispanic. Most of the subjects are Black. Most of the officers are male, and most of the subjects who committed crimes are also male.

We also found that most of the crimes happened at night in the Central division in the state of Texas, and that the most common subject description was alcohol or unknown drugs. Our analysis also covers the latitude and longitude of the crime locations, which gives a clear picture of where the incidents took place.